CAPRI: A Common Architecture for Autonomous, Distributed Diagnosis of Internet Faults using Probabilistic Relational Models
Abstract
Internet fault diagnosis today is slow, costly, and error-prone because it requires humans to run diagnostic tests and interpret their results. A fully autonomous self-diagnosing network could greatly improve diagnostic accuracy and efficiency, but such a network requires a common language for expressing diagnostic knowledge and data, and a protocol for distributed probabilistic diagnostic reasoning. In this paper I show how the Common Architecture for Probabilistic Reasoning in the Internet (CAPRI) can satisfy these requirements using probabilistic relational models (PRMs). Preliminary results indicate that CAPRI agents can diagnose HTTP proxy connection failures with over 80% accuracy using TCP failure data collected with an updated version of PlanetSeer [11].
Similar resources
Public Review: CAPRI: A common architecture for autonomous, distributed diagnosis of Internet faults using probabilistic relational models
Traditionally, Internet research has focused on usability, such as reliability and performance, while overlooking network manageability. However, as the Internet has undergone exponential growth in recent years, so has its complexity. For example, a typical end-to-end communication has to rely on the correct functioning of many network components, such as firewalls, proxies, DNS systems, rou...
Common architecture for distributed probabilistic Internet fault diagnosis
This thesis presents a new approach to root cause localization and fault diagnosis in the Internet based on a Common Architecture for Probabilistic Reasoning in the Internet (CAPRI) in which distributed, heterogeneous diagnostic agents efficiently conduct diagnostic tests and communicate observations, beliefs, and knowledge to probabilistically infer the cause of network failures. Unlike previo...
Probabilistic Models for Monitoring and Fault Diagnosis
Reliably detecting and diagnosing faults is very important for autonomous systems. The problem is made difficult due to the large number of faults that can occur and the fact that most faults cannot be observed directly, but must be inferred from noisy sensor readings. Probabilistic models, such as Partially Observable Markov Decision Processes (POMDPs), are a natural representation for trackin...
Inductive learning for fault diagnosis
There is a steadily increasing need for autonomous systems that must be able to function with minimal human intervention to detect and isolate faults, and recover from such faults. In this paper we present a novel hybrid Model based and Data Clustering (MDC) architecture for fault monitoring and diagnosis, which is suitable for complex dynamic systems with continuous and discrete variables. The...
Robust FDI for FTC Coordination in a Distributed Network System
This paper focuses on the development of a suitable Fault Detection and Isolation (FDI) strategy for application to a system of inter-connected and distributed systems, as a basis for a fault-tolerant Network Control System (NCS) problem. The work follows a recent study showing that a hierarchical decentralized control system architecture may be suitable for fault-tolerant control (FTC) of a ne...
Publication date: 2006